Next Article in Journal
Testing for Equality of Parameters from Different Load-Sharing Systems
Previous Article in Journal
One-Parameter Weibull-Type Distribution, Its Relative Entropy with Respect to Weibull and a Fractional Two-Parameter Exponential Distribution
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Tracking Poisson Parameter for Non-Stationary Discontinuous Time Series with Taylor’s Abnormal Fluctuation Scaling

by
Gen Sakoda
1,*,†,
Hideki Takayasu
2,3,† and
Misako Takayasu
1,3,†
1
Department of Mathematical and Computing Sciences, School of Computing, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama 226-8502, Japan
2
Sony Computer Science Laboratories, 3-14-13 Higashi-Gotanda, Shinagawa-ku, Tokyo 141-0022, Japan
3
Institute of Innovative Research, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama 226-8502, Japan
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Stats 2019, 2(1), 55-69; https://doi.org/10.3390/stats2010005
Submission received: 27 November 2018 / Revised: 20 January 2019 / Accepted: 21 January 2019 / Published: 23 January 2019

Abstract

:
We propose a parameter estimation method for non-stationary Poisson time series with the abnormal fluctuation scaling, known as Taylor’s law. By introducing the effect of Taylor’s fluctuation scaling into the State Space Model with the Particle Filter, the underlying Poisson parameter’s time evolution is estimated correctly from given non-stationary time series data with abnormally large fluctuations. We also developed a discontinuity detection method which enables tracking the Poisson parameter even for time series including sudden discontinuous jumps. As an example of application of this new general method, we analyzed Point-of-Sales data in convenience stores to estimate change of probability of purchase of commodities under fluctuating number of potential customers. The effectiveness of our method for Poisson time series with non-stationarity, large discontinuities and Taylor’s fluctuation scaling is verified by artificial and actual time series.

1. Introduction

The Poisson process is a basic stochastic process for events that occur at random in various natural and social phenomena, such as the number of decay of radioactive atoms, the occurrence of a failure of elements in devices, and daily sales amount of commodities [1]. As the Poisson process is fully described by just one parameter, λ , the mean value of events in a unit time, precise estimation of λ for a given data set is the key to understand the process.
Estimation of λ is an easy task if the data holds stationarity, as λ is given simply by the average number in a unit time. However, for any real system, we encounter the following two difficulties.
The first difficulty comes from non-stationarity. A wide variety of natural and social phenomena exhibit non-stationarity. The fluxes of cosmic radiation, such as active galactic nuclei and X-ray, are non-stationary on long time scales [2]. Insect population shows non-stationarity by the factors such as environmental changes and dynamics of the ecological system [3]. As for social phenomena, web page visitations, highway traffic [4] and the underlying demand for a commodity [1] changes non-stationarily by various uncontrollable factors like human activity, social trend and environmental condition. Thus, a time series in real systems generally shows complicated non-stationarity which is often unpredictable. Especially, in the case that the time scale of change of λ is short including discontinuous jumps, it is very difficult to estimate λ by the conventional methods such as the Poisson regression method [5].
The second difficulty relates to the abnormally large fluctuations whose standard deviation is proportional to λ , while the width of fluctuation of the Poisson process is known to be given by λ . Taylor found the power law [6] between the fluctuation and the mean value in observing the number of individuals in the ecosystem. This fluctuation scaling, known as Taylor’s law, has been confirmed in a variety of natural and social phenomena [7], such as biology [8], financial market [9], networks [10], frequencies of word appearances on internet blogs [11] and sales [1]. The mechanism of the fluctuation scaling is known as follows. The basic process follows a Poisson process with a constant λ , but there exists fluctuation in the population of the observing objects, that gives the standard deviation of fluctuation proportional to the mean value, λ [1,10,11,12]. This fluctuation scaling is unavoidable in many cases since the total number of the observing objects is practically uncontrollable in the real system. For precise estimation of λ we have to distinguish the variation of λ by this effect from a non-stationary temporal change of λ . However, Taylor’s fluctuation scaling has not been attracted attention in the Poisson parameter estimation.
Non-stationary parameter estimation methods have been widely studied, such as Auto Regressive Integrated Moving Average (ARIMA) [13], Neural Networks (NN) [14], which are suitable for non-stationarity with regularity characterized by autocorrelations. Generalized Additive Model (GAM) [15] can reveal the general trend of a non-stationary time series by fitting a smooth curve which is differentiable. The State Space Model (SSM) [16] is applicable for unpredictable non-stationarity with abrupt and indifferentiable changes, therefore in this paper we generalize this method to handle non-stationary Poisson time series with Taylor’s fluctuation law.
In Section 2, we explain our method with short reviews of the non-stationary time series analyses to be used in our method. In Section 3, we verify our method by some artificial Poisson time series. In Section 4, we examine the effectiveness for actual data, namely Point-of-Sales (POS) data which shows non-stationarity and the fluctuation scaling with unpredictable sudden jumps.

2. Methods

2.1. Non-Stationary Time Series Analysis

A time series analysis method suitable for non-stationary cases, the State Space Model (SSM) [16] is adopted in our method. In this subsection, we explain the SSM.
The SSM assumes that an observation value y t at time t is generated stochastically by an invisible state x t , where x t is stochastically determined depending on x t 1 .
x t Φ ( x t | x t 1 )
y t Φ ( y t | x t )
where Φ ( A | B ) denotes the Probability Density Function (PDF) of A conditioned by B, and the tilde (∼) in the formula, C Φ ( A | B ) denotes that C is a stochastic variable following the PDF.
Equation (1) is called the system model in the SSM, which describes stochastic time evolution of the state x t . Equation (2) is called the observation model in the SSM, which defines the stochastic generation of the observation value from the state.
Non-stationarity of the Poisson mean value λ can be modeled in the system model Equation (1), regarding λ as the state x t . We define the system model as follows:
x t = δ ( x t 1 + ν t )
δ ( x ) = x , if x > 0 0 , otherwise
ν t ( 1 m ) · N ( 0 , α · x t 1 ) + m · U ( β , β )
where m, α , β are hyper parameters. N ( 0 , α · x t 1 ) is the Normal distribution with the mean value equals to zero and the standard deviation equals to α · x t 1 ( 0 < α ). U ( β , β ) is the truncated Uniform distribution with the range from − β to β ( 0 < β ). m characterizes the ratio of superposition of two distributions, the Normal and truncated Uniform distributions ( 0 m 1 ). Equation (5) means that the stochastic time evolution of the change of v t generally follows either the Normal distribution with probability 1 m or the Uniform distribution with range of 2 · β with probability m. The value of β is determined as the range of the Uniform distribution is much larger than the fluctuation of the Normal distribution ( α · x t 1 ), that allows a large change of x t . The superposition of two distributions considers the case where x t generally follows the Normal distribution, but sometimes changes up to β with a small probability m. This kind of system model with long tail distribution allows better trackability of a non-stationarity. A system model defined by the superposition of a Normal distribution and a truncated Uniform distribution is already proposed by Yura et al. [17] for non-stationary time series analysis of financial market data. Here, in our model the value of x t is limited to be non-negative by Equation (4) as the Poisson parameter λ (or x t ) needs to be positive.
The hyper parameters are to be determined in accordance with non-stationarity of Poisson λ for each system. We adopted m = 0.05 , α = 0.005 and β = 2.5 · σ t 1 by examining Root Mean Squared Error of λ estimation of some artificial time series, such as stationary, step-like and continuous rising. σ t 1 is Taylor’s fluctuation term to be described in the next subsection, Equation (6). The detailed information on the hyper parameters determination is provided in the Appendix A.

2.2. Taylor’s Fluctuation Scaling Law

The standard deviation of a Poisson process with Taylor’s fluctuation scaling, which is caused by the fluctuation in the population of the observing objects, is generally given as follows [18]:
σ = λ + ( γ · λ ) 2
Here, the proportional constant γ characterizes the strength of fluctuation of population and its value is determined by examining the relationship between the standard deviation and the mean of an actual data.
As an example, we explain a case of Point-of-Sales (POS) data. The POS data is taken from 243 chain stores of a major convenience store company in Japan during the 153 days from 1 June to 31 October 2010. We obtained daily sales time series of each product in each store from the POS data.
Figure 1 illustrates Taylor’s fluctuation scaling of sales data of a product (a soft drink). The solid line indicates σ = λ + ( γ · λ ) 2 where γ = 0.12, the dot-dash line is σ = λ and the dashed line shows σ = γ · λ . Figure 1 is obtained by the following procedures.
  • For small mean values we prepare data sets of daily sales numbers of the product for each store with specification by the day of the week (For example, sales number in every Monday). 243 (stores) × 7 (days of the week) = 1701 points are plotted.
  • For larger mean values we prepare aggregated data sets by random sampling as follows. For each given number of stores, k, we choose k stores at random and the sales numbers are summed up for each day of the week. The value of k is from 2 to 243 and we repeat this procedure 30 times for each k (48,223 points are plotted.)
The γ value can be estimated by non-linear regression analysis [19]. Specifically, γ is determined by using the least-square method which minimize the difference between the σ obtained by Equation (6) and the actual σ in Figure 1.

2.3. SSM of Our Method

We incorporate Taylor’s fluctuation scaling term into the SSM. The observation model Equation (2) for the Poisson process can be written as follows:
y t P o ( x t )
where P o ( x t ) is the Poisson distribution with the mean value x t . The observation value y t are generated stochastically by the distribution. Here, x t is stochastically determined by Equations (3)–(5).
In order to incorporate Taylor’s fluctuation scaling term Equation (6) as the standard deviation of Equation (7), we approximate the Poisson distribution by the Normal distribution in the case that x t is statistically large (i.e., x t 20 ). The Normal distribution for x t = 10 and σ = 10 is very similar to the Poisson distribution for x t = 10 , and for x t 20 , the distributions are virtually indistinguishable [20]. We apply the approximation by the normal distribution for x t 20 since the skewness of the Poisson distribution of x t = 20 is 30 % smaller than that of x t = 10 , hence, closer to 0, which is the value of skewness of the normal distribution, whereas the ratio of change of the standard deviation by Taylor’s fluctuation scaling at x t < 20 is below 10 % , where γ = 0.1 .
y t N ( x t , x t + ( γ · x t ) 2 )
Thus, the likelihood of x t for Equation (7) and Equation (8) are written as PDF of each distribution,
p ( y t | x t ) = 1 2 π σ t 2 exp ( ( y t x t ) 2 2 σ t 2 ) , if x t 20 x t y t · exp ( x t ) y t ! , otherwise
where σ t = x t + ( γ · x t ) 2 .

2.4. Particle Filter

Our SSM contains non-Gaussian distribution such as the Poisson distribution and the truncated Uniform distribution. The Particle Filter algorithm [21,22] has the capability to solve such a non-Gaussian SSM.
The Particle Filter is a kind of Monte Carlo simulation which approximates a PDF of the state x t as the distribution of Monte Carlo sample values. Owing to the property of the Monte Carlo simulation, an arbitrary distribution function can be represented. The Monte Carlo samples are referred to as the particles in the Particle Filter. The distribution of the particles is updated by the observation value y t for each time t with the likelihood function. The set of the particles at time t conditioned by an observation value at time k is written in the form of { x t | k ( i ) } i = 1 N , where N is the Monte Carlo sample size. The specific procedure of the Particle Filter which corresponds to the simulation code is as follows,
  • Generate initial N particles { x 0 | 0 ( i ) } i = 1 N and set t = 1 .
  • At time t:
    (a)
    Generate prediction distribution of particles { x t | t 1 ( i ) } i = 1 N by Equations (3)–(5).
    (b)
    Using the observation value y t , estimate likelihood p ( y t | x t | t 1 ( i ) ) for each particle by Equation (9)
    (c)
    Update particles by resampling with the replacement of { x t | t 1 ( i ) } i = 1 N . The resampling probabilities ρ t ( i ) for each particle are proportional to the likelihood of each particle.
    ρ t ( i ) = p ( y t | x t | t 1 ( i ) ) / j = 1 N p ( y t | x t | t 1 ( j ) )
    (d)
    If time t is the last step, stop the procedure. Otherwise, increment t and go to step 2 (a).
In this research, the number of Monte Carlo samples is set to N = 10,000 and the initial particles are generated by Equations (3)–(5) assuming x 1 = y 0 for simplicity. We obtain a λ value for each time t by calculating the median, instead of mean, of the Monte Carlo samples of x t since its robustness against variations of particle values.
The Markov Chain Monte Carlo (MCMC) method [23] is another candidate to solve our SSM. We selected the Particle Filter since it can solve the SSM stably even with a small amount of data.

2.5. Particle Filter with Discontinuity Detection

Our SSM introduce a wide range of Uniform distribution to follow large parameter changes in time series as shown in Equation (5). Moreover, we consider the cases that include discontinuous jump points at which time series value changes more than the range of the Uniform distribution.
Simply widening the range of the Uniform distribution results in reducing the density of the particles and increasing Monte Carlo error. We propose the following discontinuity detection and estimation correction method.
  • Detect a discontinuous point by checking whether the observation value is out of the bound of the particle distribution.
  • If a discontinuous point is detected, initialize the particle distribution with the observation value at the discontinuous point.
Using the observation value y t itself at the discontinuous point as the median of the particle distribution leads to overfit the data. We adopt y t σ t or y t + σ t at the upward or downward discontinuous point respectively. Accordingly, the threshold of discontinuous point detection is set to be the upper bound + σ t or lower bound − σ t of the particle distribution.
We modify the algorithm of the Particle Filter shown in Section 2.4, step 2 (b) as follows,
  • Generate initial N particles { x 0 | 0 ( i ) } i = 1 N , set t = 1 .
  • At time t:
    (a)
    Generate prediction distribution of particles { x t | t 1 ( i ) } i = 1 N by Equations (3)–(5).
    (b)
    If the observation value y t is above the upper bound + σ t or below the lower bound σ t of the prediction distribution, go to step 1 and set x 0 to y t σ t or y t + σ t at the upward or downward discontinuous point respectively. Otherwise, using the observation value y t , estimate likelihood p ( y t | x t | t 1 ( i ) ) for each particle by Equation (9) and go to next step .
    (c)
    Update particles by resampling with the replacement of { x t | t 1 ( i ) } i = 1 N . The resampling probabilities ρ t ( i ) for each particle are proportional to the likelihood of each particle.
    ρ t ( i ) = p ( y t | x t | t 1 ( i ) ) / j = 1 N p ( y t | x t | t 1 ( i ) )
    (d)
    If time t is the last step, stop the procedure. Otherwise, increment t and go to step 2 (a).

2.6. Summary of the Parameter Estimation Procedure

We summarize the parameter estimation procedure. Steps 1 and 2 are preparations, and step 3 is the actual parameter estimation step.
  • Estimate the proportional constant γ of Taylor’s fluctuation scaling by some previous data, as explained in Section 2.2.
  • Determine the hyper parameters in the system model Equation (5) in SSM, by using some artificial time series as shown in Appendix.
  • Apply the Particle filter described in Section 2.4 and Section 2.5 to solve the SSM, and estimate Poisson λ parameter of a time series. Poisson λ parameter is estimated basically based on the likelihood for an observation data at each time t. When the discontinuous trend jump is detected as explained in Section 2.5, the value of λ is estimated based on the observation data. No extrapolation is done, namely, we do not estimate the parameter where there is no data.

3. Simulation Tests

In this section, we verify the validity of our method with artificial time series generated by random number simulation. The distribution of random numbers is the Poisson distribution for small λ , i.e., λ < 20 , otherwise the Normal distribution with Taylor’s fluctuation scaling term Equation (6) where γ = 0.1 . The γ value 0.1 is typical for the Point-of-Sales data used in this research [1]. Random numbers with a specific distribution can be obtained by the inversion method [24].

3.1. Validity for Non-Stationarity

First, we demonstrate our method without discontinuity point detection, which is described in Section 2.5. Figure 2a shows a continuous rising time series from λ = 20 to λ = 200 in gray line. In this figure, the assumed values of λ and the estimated values of λ are shown in dotted line and black line respectively. We can confirm that the estimated values of λ follows the non-stationary continuous rising trend of the assumed values of λ . The Root Mean Squared Error (RMSE) between the assumed values of λ ( λ A ) and the estimated values of λ ( λ E ), standardized by λ A at each time t ( 1 t N ) as shown in Equation (12), is 7.74 % .
R M S E λ = 1 N t = 1 N ( 1 λ E t λ A t ) 2
Figure 2b indicates a step-like time series which increases from λ = 20 to 200 in gray line. The estimated values of λ in black line generally follows the abrupt step-like change of the assumed values of λ , but there exists some delay. The assumed values of λ changes 20 to 200 at time 50, but the estimated values of λ reaches to 200 at about time 80. The R M S E λ by Equation (12) is 14.59 % .
This delay is caused by the large discontinuity of the time series which is not covered by the range of the prediction distribution, that is, the range of the truncated uniform distribution in Equation (5). By introducing the discontinuous point detection as mentioned in Section 2.5, the delay is corrected as shown in Figure 3. The R M S E λ is improved to be 6.55 % .

3.2. Validity for Taylor’s Fluctuation Scaling

Figure 4a shows a step-like time series from λ = 10 to 640 increased by two times every 30 data points which is shown in gray line, and the estimated values of λ in black line. The R M S E λ is 8.66 % .
To verify the validity of the estimated values of λ , we examined the deviation of time series values from the estimated values of λ . If the estimated values are reasonable, the relationship between the standard deviation calculated with time series and λ should follow Equation (6). The specific procedure for calculating the standard deviation and the mean is as follows,
  • Make pairs of estimated values of λ and time series value for each time t.
  • Divide the pairs into groups by exponential bins such as [ 1 , 2 ] , [ 2 , 2 2 ] , [ 2 2 , 2 3 ] , for estimated values of λ values.
  • Calculate the standard deviation with the time series value and estimated values of λ for each group. Obtain the mean with the estimated values of λ for each group.
The result is shown in Figure 4b. The plots are along the relationship of Taylor’s fluctuation scaling illustrated in black solid line. This ensures the validity of our model for Taylor’s fluctuation scaling. The RMSE between the theoretical standard deviation ( σ T ) by Equation (6) and the estimated standard deviation ( σ E ), standardized by σ T at each time t ( 1 t N ) as shown in Equation (13), is 8.71 % .
R M S E σ = 1 N t = 1 N ( 1 σ E t σ T t ) 2
To clarify the advantage of our model to the conventional Poisson model, we performed an λ estimation of the same time series as in Figure 4a with the conventional Poisson model, that is, with Equation (7) instead of Equation (8). The results are shown in Figure 4c,d. One can see that the estimated values of λ time series in Figure 4c follows the fluctuation of the time series excessively. The conventional Poisson model regards large fluctuation caused by Taylor’s fluctuation scaling as variation of λ . Figure 4d illustrates that the Poisson model estimates the fluctuation as λ in dot-dash line, while the true fluctuation is Equation (6) in solid line. The R M S E σ is 33.73 % .
Along with the step-like time series, we examined the case of a continuous rising time series. Figure 5a,b are the estimated results by our method with Taylor’s fluctuation scaling term. The R M S E σ is 8.65 % . Figure 5c,d are the results by the conventional Poisson model. The R M S E σ is 31.71 % . These results also demonstrate the effectiveness of our method for estimating the Poisson λ with Taylor’s fluctuation scaling.

4. Point-of-Sales (POS) Data Tests

We verified our method by actual data, namely, the POS data described in Section 2.2. The POS data is a typical example of the non-stationary Poisson time series with Taylor’s fluctuation scaling [1].
Figure 6a shows a daily sales number series of a product (an ice cream) at a store. One can see the arch-shaped trend in the sales time series in gray line, and the estimated values of λ in black line follows the trend. Figure 6b clarifies that the relationship between the mean and the standard deviation of the estimated values of λ values are along the relationship derived from Taylor’s fluctuation scaling Equation (6). The R M S E σ is 16.56 % .
Figure 6c,d show the estimation results of the same time series in Figure 6a, with the conventional Poisson model. The Poisson model results in overfitting of the λ to the time series as shown in Section 3.2. The R M S E σ is 35.40 % . The advantage of incorporating the term of Taylor’s fluctuation scaling in precise estimation of the Poisson parameter is confirmed with this actual data.
Figure 7a indicates a sales time series of a product (a meat bun) of a store which shows sudden rises at around time 10 and 52. The estimated values of λ in black line well follows the sudden rises. Figure 7b shows the estimated values of λ are along the relationship of Taylor’s fluctuation scaling. The R M S E σ is 18.74 % .
Figure 7c,d show the estimation results of the same time series in Figure 7a, without using the correction method at discontinuous points. We can find that the estimated values of λ in black line are underfitted to the time series in gray line at the large discontinuous points. The R M S E σ is 337.62 % . Thus, the merit of the correction method at discontinuous points is verified with the actual time series.

5. Conclusions

The effectiveness of our method of tracking the Poisson parameter for abnormally large fluctuation time series with non-stationarity is verified one by one with artificial time series. We revealed the advantage of our method considering the term of Taylor’s fluctuation scaling to the conventional Poisson model, namely, our model distinguishes non-stationarity of the parameter from the large fluctuation caused by this fluctuation scaling. We also presented the method of detecting discontinuity of the time series. By correcting estimation values at the discontinuous point in time series, we realized precise parameter estimation with non-stationary time series with large discontinuity. Along with some artificial time series, we verified the effectiveness of our method by some actual Point-of-Sales data which shows non-stationarity, large discontinuities and Taylor’s fluctuation scaling.
We assume the case that Taylor’s fluctuation scaling is generally caused by the fluctuation in the population of the observing objects, which leads the standard deviation to be proportional to Poisson mean value λ , namely, the scaling exponent becomes 1. The scaling exponent 1 is observed in many natural and social phenomena, such as cell numbers of species [25], fluxes of cosmic radiation [2], daily water-level of a river [26], web page visitations [4], highway traffic [4] and sales [1].
Considering effects other than fluctuation in the population, there are cases that the Taylor’s scaling exponent takes a value between 0.5 and 1.0 as we can find examples in POS data on stamps and magazines [1]. Stamps are often purchased in bulk, and magazines are specifically bought on the launch day. The purchase processes of these items are not regarded as a simple Poisson process, that leads to the scaling exponent under 1. The extension of the proposed model for more general Taylor’s fluctuation scaling is a potential future research task.
Periodicity in the time series, such as seasonality and circadian variation behind the daily data, are not considered in the present model. In practical applications in the future, applying the method to some cases such as trend forecast, an extension of our method taking into account such periodicity may yield better accuracy of the parameter estimation.
Our method deals with parameter estimation of a non-stationary Poisson time series of number of events observing with regular intervals in the case that the population is also changing randomly. Meanwhile, there exist methods based on Poisson point process suitable for describing events in discrete points in time. The Spatial Mixed Poisson Process(SMPP) [27,28,29,30,31] is such an example, which can handle the case that the underlying random event is correlated to the observing event that is stochastically determined. The SMPP can model the delay of the impact from the underlying event to the observing event since both events are described separately in discrete points in time, which is suitable for application to rare but correlated events such as occurrence of cascading crashes in financial markets. There are wide variety of phenomena both in nature and in our society, which are considered to be described basically by a variant of Poisson process. More different approaches will appear in the near future to cope with the increasing big data in the society.

Author Contributions

Conceptualization, G.S., H.T. and M.T.; Data curation, G.S.; Formal analysis, G.S.; Funding acquisition, M.T.; Investigation, G.S.; Methodology, G.S., H.T. and M.T.; Project administration, H.T. and M.T.; Resources, M.T.; Software, G.S.; Supervision, H.T. and M.T.; Validation, G.S.; Visualization, G.S.; Writing—original draft, G.S.; Writing—review and editing, H.T. and M.T.

Funding

This work was partially supported by JSPS KAKENHI (Grant No.18H01656) and JST, Strategic International Collaborative Research Program (SICORP) on the topic of ”ICT for a Resilient Society” by Japan and Israel.

Acknowledgments

The authors would like to thank Seven-Eleven Japan Co.,Ltd. for providing the POS data.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Estimation of the Hyper Parameters

Equation (5) contains three hyper parameters, m, α and β . We determined the value of each parameter by the following procedure. We assumed three types of time series, stationary as shown in Figure A1a, discontinuous step-like change as illustrated in Figure A2a and continuous rise as indicated in Figure A3a. We generated 100 sets of random number time series for each three types by random number simulation mentioned in Section 3. We estimate λ time series for each time series and examined Root Mean Squared Error (RMSE) of estimated and assumed values of λ by changing m, α , β . The proportional constant of the fluctuation scaling γ is assumed 0.1, which is typical for the Point-of-Sales data used in this research [1]. The β is assumed in the form of n · σ t 1 , where σ t 1 is Taylor’s fluctuation scaling term Equation (6), meaning that the range of the Uniform distribution in Equation (5) is determined based on the variance by Equation (6) at each λ .
The results for stationary time series is shown in Figure A1b–d. One can see that small RMSE for stationary time series is obtained when the value of each parameter, m, α and β are small. It is reasonable because small m, α and β lead to a narrow prediction distribution and less sensitivity to variations in a stationary time series.
The results for step-like change shown in Figure A2 are almost opposite parameter dependencies to Figure A1 since a wide prediction distribution contributes to sensitiveness to the change. One should note that a prediction distribution wider than necessary results in large RMSE because of the Monte Carlo error. The results for continuous rise in Figure A3 shows generally similar parameters dependence as Figure A2, in the cases of too small or too large m, α and β result in large RMSE.
We adopted m = 0.05 , α = 0.005 and β = 2.5 · σ t 1 , marked with yellow rectangles in Figure A1, Figure A2 and Figure A3 , since these are one of the balanced condition giving small RMSE for three types of time series examined.
Figure A1. RMSE result of stationary time series. (a) An example of the stationary time series. Gray line: artificial time series, Black dotted line: assumed values of λ . (b) m = 0.05, where m is the ratio of the Uniform distribution to the Normal distribution, (c) m = 0.10, (d) m = 0.15. The yellow square shows the parameter sets we adopted in our simulation and real data analysis.
Figure A1. RMSE result of stationary time series. (a) An example of the stationary time series. Gray line: artificial time series, Black dotted line: assumed values of λ . (b) m = 0.05, where m is the ratio of the Uniform distribution to the Normal distribution, (c) m = 0.10, (d) m = 0.15. The yellow square shows the parameter sets we adopted in our simulation and real data analysis.
Stats 02 00005 g0a1
Figure A2. RMSE result of non-stationary, step-like time series. (a) An example of the step-like time series. Gray line: artificial time series, Black dotted line: assumed values of λ . (b) m = 0.05, (c) m = 0.10, (d) m = 0.15. The yellow square shows the parameter sets we adopted in our simulation and real data analysis.
Figure A2. RMSE result of non-stationary, step-like time series. (a) An example of the step-like time series. Gray line: artificial time series, Black dotted line: assumed values of λ . (b) m = 0.05, (c) m = 0.10, (d) m = 0.15. The yellow square shows the parameter sets we adopted in our simulation and real data analysis.
Stats 02 00005 g0a2
Figure A3. RMSE result of non-stationary, continuous rising time series. (a) An example of the continuous rising time series. Gray line: artificial time series, Black dotted line: assumed values of λ . (b) m = 0.05, (c) m = 0.10, (d) m = 0.15. The yellow square shows the parameter sets we adopted in our simulation and real data analysis.
Figure A3. RMSE result of non-stationary, continuous rising time series. (a) An example of the continuous rising time series. Gray line: artificial time series, Black dotted line: assumed values of λ . (b) m = 0.05, (c) m = 0.10, (d) m = 0.15. The yellow square shows the parameter sets we adopted in our simulation and real data analysis.
Stats 02 00005 g0a3
The hyper parameters in the SSM can be adjusted by using self-organizing state space model [32]. The proposed method is for tracking a non-stationary change even at the initial stage of a time series (i.e., t < 10 ) where sample size for the adjustment is not enough. For the stability of the estimation, the value of each hyper parameters is determined as described above.

References

  1. Fukunaga, G.; Takayasu, H.; Takayasu, M. Property of Fluctuations of Sales Quantities by Product Category in Convenience Stores. PLoS ONE 2016, 11, e0157653. [Google Scholar] [CrossRef] [PubMed]
  2. Azevedo, R.B.R.; Leroi, A.M. The flux-dependent amplitude of broadband noise variability in X-ray binaries and active galaxies. Mon. Not. R. Astron. Soc. 2001, 323, L26–L30. [Google Scholar]
  3. Turchin, P.; Taylor, A.D. Complex Dynamics in Ecological Time Series. Ecology 1992, 73, 289–305. [Google Scholar] [CrossRef] [Green Version]
  4. De Menezes, M.A.; Barabási, A.-L. Separating Internal and External Dynamics of Complex Systems. Phys. Rev. Lett. 2004, 93, 068701. [Google Scholar] [CrossRef] [PubMed]
  5. Winkelmann, R. Econometric Analysis of Count Data; Springer: Berlin/Heidelberg, Germany, 2008; pp. 63–126. [Google Scholar]
  6. Taylor, L.R. Aggregation, variance and the mean. Nature 1961, 189, 732–735. [Google Scholar] [CrossRef]
  7. Eisler, Z.; Bartos, I.; Kertész, J. Fluctuation scaling in complex systems: Tayolr’s law and beyond. Adv. Phys. 2008, 57, 89. [Google Scholar] [CrossRef]
  8. Kendal, W.S. A frequency distribution for the number of hematogenous organ metastases. J. Theor. Biol. 2002, 217, 203–218. [Google Scholar] [CrossRef]
  9. Eisler, Z.; Kertész, J. Scaling theory of temporal correlations and size-dependent fluctuations in the traded value of stocks. Phys. Rev. E. 2006, 73, 046109. [Google Scholar] [CrossRef]
  10. De Menezes, M.A.; Barabási, A.-L. Fluctuations in network dynamics. Phys. Rev. Lett. 2004, 92, 028701. [Google Scholar] [CrossRef]
  11. Sano, Y.; Yamada, K.; Watanabe, H.; Takayasu, H.; Takayasu, M. Empirical analysis of collective human behavior for extraordinary events in the blogosphere. Phys. Rev. E. 2013, 87, 012805. [Google Scholar] [CrossRef]
  12. Meloni, S.; Gomez-Gardeñes, J.; Latora, V.; Moreno, Y. Scaling Breakdown in Flow Fluctuations on Complex Networks. Phys. Rev. Lett. 2008, 100, 208701. [Google Scholar] [CrossRef] [PubMed]
  13. Box, G.; Jenkins, G. Time Series Analysis: Forecasting and Control; Holden-Day: San Francisco, CA, USA, 1976. [Google Scholar]
  14. Zhang, G.P. Neural Networks for Time-Series Forecasting, Handbook of Natural Computing; Springer: Berlin/Heidelberg, Germany, 2012; pp. 461–477. [Google Scholar]
  15. Hastie, T.; Tibshirani, R. Generalized Additive Models; Chapman and Hall: London, UK, 1990. [Google Scholar]
  16. Harvey, A. Forecasting with Unobserved Components Time Series Models, Handbook of Economic Forecasting; Elsevier: Amsterdam, The Netherlands, 2006. [Google Scholar]
  17. Yura, Y.; Takayasu, H.; Nakamura, K.; Takayasu, M. Rapid Detection of the Switching Point in a Financial Market Structure Using the Particle Filter. J. Stat. Comput. Simul. 2014, 84, 2073–2090. [Google Scholar] [CrossRef]
  18. Watanabe, H.; Sano, Y.; Takayasu, H.; Takayasu, M. Statistical properties of fluctuations of time series representing appearances of words in nationwide blog data and their applications: An example of modeling fluctuation scalings of nonstationary time series. Phys. Rev. E 2016, 94, 052317. [Google Scholar] [CrossRef]
  19. Draper, N.R.; Smith, H. Applied Regression Analysis; Wiley: Hoboken, NJ, USA, 1998; pp. 505–565. [Google Scholar]
  20. Cherry, S.R.; Sorenson, J.A.; Phelps, M.E. Physics in Nuclear Medicine, 4th ed.; Elsevier: Amsterdam, The Netherlands, 2012; pp. 126–128. [Google Scholar]
  21. Kitagawa, G. Monte Carlo Filter and Smoother for Non-Gaussian Nonlinear State Space Models. J. Comput. Graph. Stat. 1996, 5, 1–25. [Google Scholar]
  22. Gordon, N.J.; Salmond, D.J.; Smith, A.F. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proc. F-Radar Signal Process. 1993, 140, 107–113. [Google Scholar] [CrossRef]
  23. Liu, J.S. Monte Carlo Strategies in Scientific Computing; Springer: Berlin, Germany, 2001. [Google Scholar]
  24. Luc, D. Non-Uniform Random Variate Generation; Springer: Berlin/Heidelberg, Germany, 1986; pp. 27–36. [Google Scholar]
  25. Azevedo, R.B.R.; Leroi, A.M. A power law for cells. Proc. Natl. Acad. Sci. USA 2001, 98, 5699–5704. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  26. Jánosi, I.M.; Gallas, J.A.C. Growth of companies and water-level uctuations of the river Danube. Physica A 1999, 271, 448–457. [Google Scholar] [CrossRef]
  27. Cerqueti, R.; Fenga LVentura, M. Does the U.S. exercise contagion on Italy? A theoretical model and empirical evidence. Physica A 2018, 499, 436–442. [Google Scholar] [CrossRef]
  28. Cerqueti, R.; Ventura, M. Risk and Uncertainty in the Patent Race: A Probabilistic Model. IMA J. Manag. Math. 2015, 26, 39–62. [Google Scholar] [CrossRef]
  29. Cerqueti, R.; Ventura, M. Patent Valuation Under Spatial Point Processes with Delayed and Decreasing Jump Intensity. BE J. Theor. Econ. 2015, 15, 433–456. [Google Scholar] [CrossRef]
  30. Cerqueti, R. Exhaustion of Resources: A Marked Temporal Process Framework. Stoch. Environ. Res. Risk Assess. 2014, 28, 1023–1033. [Google Scholar] [CrossRef]
  31. Cerqueti, R.; Foschi, R.; Spizzichino, F. A Spatial Mixed Poisson Framework for Combination of Excess of Loss and Proportional Reinsurance Contracts. Insur. Math. Econ. 2009, 45, 59–64. [Google Scholar] [CrossRef]
  32. Kitagawa, G. A self-organizing state-space model. J. Am. Stat. Assoc. 1998, 93, 1203–1215. [Google Scholar] [CrossRef]
Figure 1. Example of Taylor’s fluctuation scaling for daily sales number of a soft-drink in convenience stores. The mean values and the standard deviations are plotted for various data sets with different number of aggregated stores (49,924 points are plotted). Dot-dash line: σ = λ , Dashed line: σ = γ · λ , Solid line: σ = λ + ( γ · λ ) 2 , where γ = 0.12.
Figure 1. Example of Taylor’s fluctuation scaling for daily sales number of a soft-drink in convenience stores. The mean values and the standard deviations are plotted for various data sets with different number of aggregated stores (49,924 points are plotted). Dot-dash line: σ = λ , Dashed line: σ = γ · λ , Solid line: σ = λ + ( γ · λ ) 2 , where γ = 0.12.
Stats 02 00005 g001
Figure 2. Estimation results of two types of artificial non-stationary time series using the particle filter without discontinuous detection shown in Section 2.4. (a) Continuous rise time series from λ = 20 to 200. (b) Step-like time series from λ = 20 to 200. Gray line: artificial time series, Black dotted line: assumed values of λ , Black solid line: estimated values of λ .
Figure 2. Estimation results of two types of artificial non-stationary time series using the particle filter without discontinuous detection shown in Section 2.4. (a) Continuous rise time series from λ = 20 to 200. (b) Step-like time series from λ = 20 to 200. Gray line: artificial time series, Black dotted line: assumed values of λ , Black solid line: estimated values of λ .
Stats 02 00005 g002
Figure 3. Correction of estimated values of λ at the discontinuous point using the particle filter with discontinuous detection shown in Section 2.5. Gray line: artificial time series, Black dotted line: assumed values of λ , Black solid line: estimated values of λ .
Figure 3. Correction of estimated values of λ at the discontinuous point using the particle filter with discontinuous detection shown in Section 2.5. Gray line: artificial time series, Black dotted line: assumed values of λ , Black solid line: estimated values of λ .
Stats 02 00005 g003
Figure 4. Verification for Taylor’s fluctuation scaling with step-like time series from λ = 10 to 640. (a) Black solid line: estimated values of λ with our model, Gray line: artificial time series, Black dotted line: assumed values of λ (b) The mean and the standard deviation plots of (a). (c) Black solid line: estimated values of λ with conventional Poisson model, Gray line: artificial time series, Black dotted line: assumed values of λ . (d) The mean and the standard deviation plots of (c). In (b,d), Dot-dash line: σ = λ , Dashed line: σ = γ · λ , Solid line: σ = λ + ( γ · λ ) 2 , where γ = 0.1 .
Figure 4. Verification for Taylor’s fluctuation scaling with step-like time series from λ = 10 to 640. (a) Black solid line: estimated values of λ with our model, Gray line: artificial time series, Black dotted line: assumed values of λ (b) The mean and the standard deviation plots of (a). (c) Black solid line: estimated values of λ with conventional Poisson model, Gray line: artificial time series, Black dotted line: assumed values of λ . (d) The mean and the standard deviation plots of (c). In (b,d), Dot-dash line: σ = λ , Dashed line: σ = γ · λ , Solid line: σ = λ + ( γ · λ ) 2 , where γ = 0.1 .
Stats 02 00005 g004
Figure 5. Verification for Taylor’s fluctuation scaling with continuous rising time series from λ = 10 to 600. (a) Black solid line: estimated values of λ with our model, Gray line: artificial time series, Black dotted line: assumed values of λ . (b) The mean and the standard deviation plots of (a). (c) Black solid line: estimated values of λ with conventional Poisson model, Gray line: artificial time series, Black dotted line: assumed values of λ . (d) The mean and the standard deviation plots of (c). In (b,d), dot-dash line: σ = λ , dashed line: σ = γ · λ , solid line: σ = λ + ( γ · λ ) 2 , where γ = 0.1 .
Figure 5. Verification for Taylor’s fluctuation scaling with continuous rising time series from λ = 10 to 600. (a) Black solid line: estimated values of λ with our model, Gray line: artificial time series, Black dotted line: assumed values of λ . (b) The mean and the standard deviation plots of (a). (c) Black solid line: estimated values of λ with conventional Poisson model, Gray line: artificial time series, Black dotted line: assumed values of λ . (d) The mean and the standard deviation plots of (c). In (b,d), dot-dash line: σ = λ , dashed line: σ = γ · λ , solid line: σ = λ + ( γ · λ ) 2 , where γ = 0.1 .
Stats 02 00005 g005
Figure 6. Estimation results of an arch-shaped sales time series. (a) Gray line: sales time series, Black line: estimated values of λ with our model. (b) The mean and the standard deviation plots of (a). (c) Gray line: sales time series, Black line: estimated values of λ with conventional Poisson model. (d) The mean and the standard deviation plots of (c). In (b,d), dot-dash line: σ = λ , dashed line: σ = γ · λ , solid line: σ = λ + ( γ · λ ) 2 , where γ = 0.291 (an ice cream) [1].
Figure 6. Estimation results of an arch-shaped sales time series. (a) Gray line: sales time series, Black line: estimated values of λ with our model. (b) The mean and the standard deviation plots of (a). (c) Gray line: sales time series, Black line: estimated values of λ with conventional Poisson model. (d) The mean and the standard deviation plots of (c). In (b,d), dot-dash line: σ = λ , dashed line: σ = γ · λ , solid line: σ = λ + ( γ · λ ) 2 , where γ = 0.291 (an ice cream) [1].
Stats 02 00005 g006
Figure 7. Estimation results of a sales time series with discontinuity. (a) Gray line: sales time series, Black line: estimated values of λ with our model. (b) The mean and the standard deviation plots of (a). (c) Gray line: sales time series, Black line: estimated values of λ with our model without the correction method at discontinuous points. (d) The mean and the standard deviation plots of (c). In (b,d), dot-dash line: σ = λ , dashed line: σ = γ · λ , solid line: σ = λ + ( γ · λ ) 2 , where γ = 0.12 (a fried item) [1].
Figure 7. Estimation results of a sales time series with discontinuity. (a) Gray line: sales time series, Black line: estimated values of λ with our model. (b) The mean and the standard deviation plots of (a). (c) Gray line: sales time series, Black line: estimated values of λ with our model without the correction method at discontinuous points. (d) The mean and the standard deviation plots of (c). In (b,d), dot-dash line: σ = λ , dashed line: σ = γ · λ , solid line: σ = λ + ( γ · λ ) 2 , where γ = 0.12 (a fried item) [1].
Stats 02 00005 g007

Share and Cite

MDPI and ACS Style

Sakoda, G.; Takayasu, H.; Takayasu, M. Tracking Poisson Parameter for Non-Stationary Discontinuous Time Series with Taylor’s Abnormal Fluctuation Scaling. Stats 2019, 2, 55-69. https://doi.org/10.3390/stats2010005

AMA Style

Sakoda G, Takayasu H, Takayasu M. Tracking Poisson Parameter for Non-Stationary Discontinuous Time Series with Taylor’s Abnormal Fluctuation Scaling. Stats. 2019; 2(1):55-69. https://doi.org/10.3390/stats2010005

Chicago/Turabian Style

Sakoda, Gen, Hideki Takayasu, and Misako Takayasu. 2019. "Tracking Poisson Parameter for Non-Stationary Discontinuous Time Series with Taylor’s Abnormal Fluctuation Scaling" Stats 2, no. 1: 55-69. https://doi.org/10.3390/stats2010005

Article Metrics

Back to TopTop